<img src="http://nci.org.au/wp-content/themes/nci/img/img-logo-large.png", width=400>

Analysing and visualising data from multiple sources in QGIS

In this notebook:

  • How to use QGIS (http://qgis.org) as a tool for interacting with data using web data services and on the VDI filesystem.
  • Obtaining data from a Web Coverage Service (WCS)
  • Obtaining data from a web-based vector data service
  • Obtaining data from the NCI file system using the NetCDF Operators (NCO)
  • Using QGIS to fuse data and create a shareable, interactive 3D map rendered using webGL in a browser

We are going to make an interactive version of the map below, starting with data from three different sources merged using QGIS. We will also see some key differences between data access methods.

The following material uses Geoscience Australia's Elevation Data Collection which is available under the Create Commons License 4.0 through NCI's THREDDS Data Server. For more information on the collection and licensing, please click here. Also used is ANU Water and Landscape Dynamics tree cover product, available under the Create Commons License 4.0 through NCI's THREDDS Data Server. For more information on the collection and licensing, please click here.

Introduction and setup

1. QGIS - what and why?

a. What is QGIS?

QGIS (Quantum GIS) is an application for geospatial data analysis and visualisation, focussed on 2D (raster and vector) datasets, and wrapped in a straightforward graphical user interface. It can also be used as a Python library, but for this session we use the graphical interface.

b. I have [insert many other tools] on the VDI - why use QGIS?

QGIS can be used on the VDI as a rapid-prototyping tool. If you've got all the infrastructure you need available in your project space and don't need it, that's great!

We hope to demonstrate some ways which QGIS can help you quickly develop shareable, multi-source analyses quickly and simply.

c. What is the general approach of this exercise?

Imagine you've just run a model or synthesised some new analytical data and want to quickly compare your results with other data sources, inspect the results, and share with colleagues - essentially create a visualisation you can demonstrate.

2. Define a question

For this session the question we want to address is:

How are the 'hilliness' of suburban areas and and 'tree cover' in Canberra related? ...and do hillier sections have more or less tree cover than flatter sections?

3. Choose useful datasets

To answer our question we need some data on topography, tree cover and section boundaries in the ACT.

From NCI's collections we can use:

  • A Shuttle Radar Topography Mission raster DEM at 1 arc-second resolution
  • Some tree cover indices from the ANU Water and Landscape Dynamics collection: /g/data2/ub8/

NCI doesn't hold cadastral data, so we'll get those from the local land administrator:

  • ACT cadastral data outlining sections (held outside NCI)

Note - here, sections are collections of 50 or so individual house blocks. Our elevation and tree cover data are too coarse for individual house blocks - and the vector data file for blocks is massive!

These data will come in three different formats:

  1. GeoTIFF (constructed from a NetCDF file using WCS)
  2. NetCDF4
  3. Vector data as GeoJSON, which we will save as Geopackage or ESRI shapefile for use in QGIS

4. Choose data manipulation strategies

The use case for this exercise is to quickly merge prototype data (eg model output, analysis results) with relevant geospatial data for initial analysis and sharing. We will use three different approaches to obtain data for this map:

  1. We will obtain elevation data from the SRTM national mosaic using WCS. The assumption here is that we can find data via the NCI service catalogues more easily than on the file system.
  2. We will obtain tree cover data from a NetCDF file held on the NCI file system. We assume that this is your own data, that may not be published anywhere yet.
  3. For the cadastral data, we will request vector data not held at NCI using a web service.

The WCS request uses only a web browser. For filesystem data we use the NetCDF operators (NCO, http://nco.sourceforge.net/) to subset and manipulate our data. And for external web service requests we will use QGIS native capabilities to ingest data.


Further reading about the data we are using

Elevation

We use an SRTM elevation mosaic, held in Geoscience Australia's elevation reference collection:

  • THREDDS: http://dapds00.nci.org.au/thredds/catalog/rr1/Elevation/NetCDF/catalog.html
  • Geonetwork: https://geonetwork.nci.org.au/geonetwork/srv/eng/catalog.search#/metadata/f9082_1236_9859_8989
  • Filesystem: /g/data1/rr1/Elevation/NetCDF/

Licensed under Creative Commons (CCBY4): http://dapds00.nci.org.au/thredds/fileServer/licenses/rr1_licence.pdf

Tree cover

These data come from the ANU Water and Landscape Dynamics collection:

  • THREDDS: http://dapds00.nci.org.au/thredds/catalog/ub8/au/catalog.html
  • Geonetwork: http://geonetwork.nci.org.au/geonetwork/srv/eng/catalog.search#/metadata/f1104_2906_7276_3004
  • Filesystem: /g/data1/ub8/au/treecover/

Licensed under Creative Commons (CCBY4): http://dapds00.nci.org.au/thredds/fileServer/licenses/ub8_licence.txt

Cadastral data from ACTMAPi

Data used to define cadastral sections in the ACT are obtained from the ACT Government's ACTMAPi system (http://www.actmapi.act.gov.au). Data are released under a creative commons license (CCBY4): https://creativecommons.org/licenses/by/4.0/

The section data landing page is here: http://actmapi.actgov.opendata.arcgis.com/datasets/eedcda7873934e789093b093521b0299_3 ...where you can obtain the data via API links or direct download (for example, as a shapefile).

A GeoJSON version of the data can be downloaded at this link: http://actmapi.actgov.opendata.arcgis.com/datasets/eedcda7873934e789093b093521b0299_3

Setup and data wrangling

1. Create some working spaces

We need two directories to work with here - one to hold data and your QGIS project, one to hold the resulting visualisation. Open a terminal window and:

$ mkdir ./qgis_vis_data

...to hold some data subsets we will create, then:

$ mkdir ./qgis_vis_map

...to hold our finished map. We will use these throughout the session.

2. Get topography data using Web Coverage Services

For topography we need a terrain model in a raster format, e.g. GeoTIFF, which covers only the Canberra region.

Clicking the link below will result in your browser asking to download a file named WCSxxxxxxxxx.tif, where xxxxxxxxx is a series of numbers:

http://dapds00.nci.org.au/thredds/wcs/rr1/Elevation/NetCDF/1secSRTM_DEMs_v1.0/DEM/Elevation_1secSRTM_DEMs_v1.0_DEM_Mosaic_dem1sv1_0.nc?service=WCS&version=1.0.0&request=GetCoverage&Coverage=elevation&bbox=148.7,-35.8,149.5,-35.1&format=GeoTIFF_Float

Save the WCSxxxxxxxxx.tif file, then move it from your Downloads directory it to your qgis_vis_data directory and rename it to something memorable, for example act_dem.tif

What happened here? We used the THREDDS web coverage service (WCS) to clip a small chunk from the NetCDF elevation mosaic and give us data as a GeoTIFF.

A quick summary of how the request is constructed is below:

  • the path to the data: http://dapds00.nci.org.au/thredds/wcs/rr1/Elevation/NetCDF/1secSRTM_DEMs_v1.0/DEM/
  • the dataset: Elevation_1secSRTM_DEMs_v1.0_DEM_Mosaic_dem1sv1_0.nc
  • the service: service=WCS
  • the service version: version=1.0.0
  • the thing we want to do (get a coverage): request=GetCoverage
  • the coverage (or layer) we want to get: Coverage=elevation
  • the boundary of the layer we want: bbox=148.7,-35.8,149.5,-35.1
  • the format we want to get our coverage as: format=GeoTIFF_Float

Note the use of GeoTIFF_Float - required for THREDDS to return a GeoTIFF with pixels holding data ranges. Other OWS providers (eg Geoserver) might not require the _Float suffix

3. Acquire tree cover data from the file system

Here we pretend that you have some shiny new data on the NCI filesystem that isn't published yet, and may need some wrangling to make it workable with other datasets, but you want to show it off and compare it with other datasets.

The treecover netCDF file we are interested in here covers all of Australia - but we don't want to try and load the whole country for this task. Let's stick to our region of interest, and use the NetCDF Operators (NCO - http://nco.sourceforge.net/) to grab the part we need.

In a new terminal move to your qgis_vis_data directory and load NCO:

$ cd ~/qgis_vis_data
$ module load nco/4.5.3

Use the NCO utility ncks to clip a region from the ANU WALD TreeCover dataset for 2015 from /g/data1/ub8/:

$ ncks -v TreeCover -d latitude,-35.8,-35.1 -d longitude,148.7,149.5  /g/data1/ub8/au/treecover/ANUWALD.TreeCover.25m.2015.nc ./treecover_2015_-35.8-35.1_148.7_149.5_original.nc

QGIS will panic if you attempt to load this particular netCDF subset, because it misinterprets the order of latitude and longitude and tries to place the image near 149 north, -35 east. If you're interested, try loading the file and see!

A quick call to another NCO utility fixes that - here we use ncpdq to swap the coordinate axis order in the TreeCover variable for our subset:

$ ncpdq -v TreeCover -a latitude,longitude treecover_2015_-35.8-35.1_148.7_149.5_original.nc treecover_2015_-35.8-35.1_148.7_149.5_latlonswap.nc

Use ncdump -h on both the original and modified files to see what changed. Loading this subset into QGIS will be much quicker, and also place the data in it's correct location.


Note

Rearranging dimensions to load netCDF data into QGIS is not always necessary. For example, try adding this gravity map as a raster layer (instructions below) - it meets NetCDF-CF, and loads without any modification:
/g/data/rr2/National_Coverages/onshore_Bouguer_offshore_Freeair_gravity_geodetic_June_2009/ onshore_Bouguer_offshore_Freeair_gravity_geodetic_June_2009.nc

Here, we demonstrate a method of quickly viewing parts of your output data - even if they are not yet fully QC'ed and need a little massaging to get going.

4. Find and obtain cadastral data

ACT publish a lot of spatial data on their ACTMAPi interface. To shortcut, this link will download cadastral section data as a GeoJSON file:

http://actmapi.actgov.opendata.arcgis.com/datasets/eedcda7873934e789093b093521b0299_3.geojson

Move the resulting 'ACT_Sections.geojson' file to your qgis_vis_data directory.


The ACT section data description is here:

http://actmapi.actgov.opendata.arcgis.com/datasets/eedcda7873934e789093b093521b0299_3

You can download a shapefile and import it to QGIS, or select a service from the 'API' menu box. The GeoJSON link used here is copied from the API list. If you're steaming ahead, explore some other services available here.

Making your map in QGIS

1. Start QGIS on the VDI

Like most VDI applications, we use the module load system to acquire QGIS. In a terminal window type:

module purge
module load proj/4.8.0 qgis/2.14.8-py2.7
  • proj.4 is required for coordinate transformations used when building our 3D visualisation
  • QGIS is our main tool for this worksheet
qgis &

This will start QGIS. You can safely ignore warnings in this terminal window - minimise it.

Just briefly we'll introduce some terms referred to often:

2. Import elevation and tree cover layers into QGIS

Follow the screenshot directions here to import your raster DEM into QGIS:

If you are asked to select a coordinate reference system, use WGS84 / EPSG:4326

Repeat this step with your modified tree cover NetCDF file - load it into QGIS as a raster layer

If you are asked to select a coordinate reference system, use WGS84 / EPSG:4326

After this step, your tree cover and elevation layers should cover exactly the same region - assuming you've used the same geographic subset boundaries for both datasets.

3. Style the tree cover layer

So far we have a bunch of grey things - let's make them colourful!

We don't need to worry about the elevation data, but we do need a useful colour scheme for vegetation data and our cadastral data. Double click your vegetation layer in the layers panel to bring up it's properties dialogue. From there:

  1. Click 'style' in the left panel (dark background)
  2. In the 'render type' dropdown, choose 'Singleband pseudocolor'
  3. Leave interpolation as 'linear', and pick a colour palette. Because the data are continuous, a single hue continuous colour palette makes sense. Because it's vegetation, green also makes sense. Choose what makes sense to you.
  4. Choose 'equal interval' in the 'Mode' dropdown (try others, see what happens)
  5. Click 'classify' to show the palette in the big window, and apply with with 'apply'.
  6. Click 'close' to return to your map.

Your map window should look something like this:

Extension:

Steaming ahead? Add a graticule to your QGIS map layer...

How could you automagically save WCS data with sane, memorable names?

Hints:

Getting lost? right click any layer in the layers panel, and choose zoom to layer to reset your view to that layer in the map window.

Q & A

Why not just use the QGIS OWS browser to grab these data?

  1. The QGIS WCS browser and THREDDS don't play well - QGIS attenuates the full THREDDS WCS URL, so it is far more convenient to form a request independently of QGIS.
  2. OWS layers can't be used for processing - ie band math, and the visualisation tools we're about to use.
  3. QGIS's OWS engine would attempt to load the full dataset - which is far more than we need.

If you're keen, do some exploring - we don't know *everything* about QGIS - surprise us! More about QGIS OWS capabilities can be found here: http://docs.qgis.org/2.14/en/docs/user_manual/working_with_ogc/index.html

Why not get these data from the filesystem?

Great question! The main reason is that we're demonstrating QGIS as a way to fuse data from manifold sources. Over today and tomorrow you'll see a log examples of how to collect data from the filesystem in manageable chunks - which you could then analyse/visualise using the methods shown here. Demonstrating web coverage services on the VDI shows how you can pull data from many external sources to help interpret your work.

For more on working with raster data, see: http://docs.qgis.org/2.14/en/docs/user_manual/working_with_raster/index.html

4. Import vector data from GeoJSON

Follow the screenshot directions here to import the GeoJSON file you saved earlier into QGIS:

If you are asked to select a coordinate reference system, use WGS84 / EPSG:4326

...and here is the result:

Extension:

Could we open web-based vector files another way in QGIS? What are two alternate methods using the QGIS interface?

For more on working with vector data in QGIS, see: http://docs.qgis.org/2.14/en/docs/user_manual/working_with_vector/index.html

5. Install two useful QGIS plugins

To procees we need to install two QGIS plugins - Qgis2threejs and Zonal statistics :

  1. Head to the plugins menu and choose manage and install plugins
  2. In the resulting dialogue box, search for Qgis2threejs
  3. Select the Qgis2threejs plugin and click install.

The Zonal Statistics plugin is installed by default but needs activating. Search for Zonal statistics in the plugin manager, and check the box next to the plugin name. Then click close and you're done.

6. Using zonal statistics for basic analysis

We are aiming to show:

  • Standard deviaton of elevation of blocks as a proxy for hilliness; and
  • The mean treecover of each section.

The Zonal statistics plugin collects data from raster layers using polygons in a vector layer, and computes basic statistics for the chunk of the raster layer inside each polygon.

a. Section hilliness

As a proxy for section hilliness, we'll use the standard deviation of elevation in each section polygon.

Head to raster -> zonal stats to open the plugin.

In the zonal statistics plugin dialogue, choose the DEM as the raster layer, and use band 1. Then choose your ACT blocks vector layer. In the statistics to calculate, pick an appropriate set - but include standard deviation - this is our roughness proxy. Add a meaningful prefix to the statistics (e.g. 'dem_'), so you can find them when you need to use them.

QGIS will spend some time calculating stats for each block, and add the output to the vector layer (act_sections) as another attribute column.

b. Tree cover per section

For tree cover, each grid cell/pixel shows an estimate of tree cover as a percentage of the pixel. As a quick measure we could take the mean of all the pixel values inside a section to get an idea of it's tree coverage.

Open the zonal statistics plugin again, but choose the tree cover as the raster layer and use band 1. Then choose your ACT sections vector layer. In the statistics to calculate, the values we need are preselected - we'll use the mean value for each section. Again, use a meaningful prefix (for example, 'tree_') for the attribute you are adding to the act_sections dataset.

c. Use mean tree cover per section to style our vector layer

We avoided styling our vector layer earlier, but now it's time - since we want to visualise the tree cover in each section.

Double click your act_sections layer to open it's properties dialogue, and head to the style tab. Here, we want to apply a good looking colour scale based on mean tree cover (the data we just generated). Set up as follows, so that your style dialogue looks like the screenshot below:

  • Choose 'graduated' from the menu at top left
  • In 'column' choose 'treemean' (assuming you used tree as a prefix in the last step)
  • Choose a colour ramp
  • Click 'classify'
  • Click 'Apply' to preview or 'OK' to commit your changes

Feel free to tinker with other settings - you can always remove and reload the layer to start over

After you've clicked OK, right click your act_sections layer and select zoom to layer to take a closer look at the results.

We already have a result!

So far we can visualise the tree cover of sections in the ACT - but how can we relate that quickly and easily to section hillines? And how could we interactively explore the relationship?

Question:
Is a graphical GIS the only way to collect zonal statistics using a vector layer to segment raster data?

7. Using a simple 3D visualisation to complete the picture

So far we've imported three different datasets into QGIS and created some new attributes on our vector dataset. We've styled our vector dataset to tell us something about treecover in each section. Here, we show how to visualise 'hilliness' alongside those data and make an interactive, explorable visualisation.

In the scheme we're about to set up:

  • If hillier blocks are more vegetated, dark green blocks will visualise as taller columns; and
  • If hillier blocks are less vegetated, light green blocks will visualise as taller columns Lets test it out!

Here we use the second plugin - Qgis2threejs. This renders the current screen to a WebGL map in a .html page using three.js - with some neat data visualisation features. You can open the result as an interactive map in a web browser.

Note: Follow all of the data import steps shown above before heading into this section.

a. Setting up - coordinate transformation in QGIS

So far we've worked in a WGS84 (EPSG:4326) coordinate system - but in order to render our map in three.js, we need to project our data into something which has units of metres, not degrees. Let's choose GDA94/MGA55 (EPSG:28355):

  1. Locate the panel at the lower right of the QGIS window which says 'EPSG:4326'
  2. Click it to open a CRS selection dialog
  3. Select 'Enable 'on the fly' CRS transformation (OTF)' at the top left
  4. Enter 'MGA 55' in the 'filter' box, then highlight GDA94 /MGA 55 in the 'coordinate systems of the world...' box. It should show up in the 'selected CRS' panel.
  5. Click OK.

You'll see that everything has warped a touch, and your CRS panel (lower right) reflects your choice. Proj.4 handles all the rest for you!


QGIS sits on top of the Proj.4 library - meaning that any of the transformation capabilities offered by Proj.4 are available. Read more here: http://docs.qgis.org/2.14/en/docs/user_manual/working_with_projections/working_with_projections.html

Note

QGIS uses both layer coordinate reference systems and a 'project wide' coordinate reference system. To use on-the-fly reprojection for the whole map you're constructing, ensure that each layer is loaded in it's appropriate CRS. For an example, you'd specify a layer CRS of EPSG:3577 for an AGDC LandSAT tile, and use OTF reprojection to use it with another layer referenced in EPSG:4326

b. Set a clipping frame

Qgis2threejs attempts to render the whole map window. We want to limit or map to the extents of our DEM - so we can either zoom the map window in so that our region of interest occupies the whole window, or set a clipping polygon in a new vector layer. Here, we zoom in so that our region of interest fills the map window.

The easiest way to achieve this is zooming to the extents of the ACT sections layer you've made. Right click the layer name, and select zoom to layer. You may need to use the hand (from the upper row of icons in your QGIS window) to move your map around so that your map window is completely filled, and the magnifying glass or mouse scroll wheel to adjust the zoom level. You should end with something like the second screenshot in section 10 (above).


The next steps render the entire map window in a WebGL-based 3D viewer - it's OK to leave some empty space around your region of interest, but it works better to completely fill your map window with relevant data.

c. Setting up in Qgis2threejs

Head to web -> Qgis2threejs to open the plugin dialogue. Click 'world' in the left pane, here we define basic parameters about the map we're creating. It's usually prettier to apply some vertical exaggeration in Australia, try between 1.5 and 5. Using 5 gives Canberra a pretty alpine feel.

Next click 'DEM'. Here you select the SRTM data you grabbed via WCS as the dem to build terrain. You can set an image to be draped on the DEM - map window view, a specific layer, an image or a plain colour. This example uses the DEM layer to colour the terrain map. Also turn shading on to get pretty hillshading.

Finally check the radio button next to our ACT sections layer in the left panel. Here we set up some visualisation parameters for our vector layer (which contains the section-level treecover statistics we generated). Set the Z coordinate to'Absolute value', and enter 580 m. We're going to compare the heights of extruded polygons, so we should start them all in the same place!

Keep the colour as 'feature style', set 'Transparency' to 20-30 and finally, choose a data source to detemine extruded polygon heights. We finally relate the hilliness of sections to tree cover here - choose dem_stdev, and a multiplier (10 works well in this example).

d. show the map!

We finally use our qgis_vis_map directory - browse to it at the 'output html file path' box and use a memorable name to save your output html file. Then click run. All being well firefox will open and display an interactive map like the one below! See section 13 for links to prebuilt versions hosted at github.io.

If the map doesn't pop up right away navigate to the place you just stored the .html file, and open it in a browser.

If firefox complains about being unresponsive, tell it to wait. If if asks about an unresponsive script, tell it to continue

If there are a lot of users on the VDI, exploring your map might be a little slow

8. So what do our results mean? Interpreting our picture

If hilly blocks have generally more tree cover than flatter blocks, short columns should be lighter shades of blue (in this colour scheme).

...but that's not necesarily what we see! Why not?

Extension activities

  1. Which suburbs have the most trees? which is the hilliest? * **hint** - use another QGIS plugin, and another web mapping data source - and see the example in section 9*
  2. Why do some hilly sections have very little treecover?
  3. Create a totally new interactive 3D model using the stack we've just been working with, and give us a URL to see your work!
  4. Use some other data collections at NCI to display data on your DEM
  5. Visualise the same result a different way - do we *really* need qgis2threejs?

9. Sharing your map

The map you just created using QGIS2threejs is a fully stand-alone webGL application that can be dropped into any web server to share with collaborators.

You can't serve data directly from the VDI - but you can copy the whole qgis_vis_map folder to some web host (e.g. github.io) and share with the world. Here's an example:

https://adamsteer.github.io/nci_samples/qgis2threejs/treecover.html

Here's another example with an OpenStreetMap layer rendered on the DEM:

https://adamsteer.github.io/nci_samples/qgis2threejs/treecover-osm.html

What are some other ways to share QGIS projects?

Possible solutions to questions

Sane WCS coverage request names

One option is to use curl on the command line:

curl -o SRTM_dem_139_36.tiff 'http://dapds00.nci.org.au/thredds/wcs/rr1/Elevation/NetCDF/1secSRTM_DEMs_v1.0/DEM/Elevation_1secSRTM_DEMs_v1.0_DEM_Mosaic_dem1sv1_0.nc?service=WCS&version=1.0.0&request=GetCoverage&Coverage=elevation&bbox=148.7,-35.8,149.5,-35.1&format=GeoTIFF_Float'

...are there others? What was yours?

Zonal statistics right in a notebook

You could also try rasterstats: https://github.com/perrygeo/python-rasterstats - any other suggestions?

Other ways to share your QGIS projects, and other 3D visualisers

For simple projects with CCBY4 data, the QGIS cloud is one way of sharing your results. Another is a Jupyter notebook hosted on github as a gist, a notebook, or as a github.io page. What was your approach?

Further reading

Here is a great tutorial on netCDF and QGIS: http://www.ggiuliani.ch/download/netcdf_qgis_GG.pdf - covering a lot more than we did here, and useful for working with data directly on the VDI filesystem.

Also, inspect the result of your work - what is Three.js doing? What can you apply from this example to other work? The Qgis2threejs plugin can only add so much - how could you render additional datasets in the same map?

Summary

Here we:

  • Used a graphical GIS available on the VDI
  • Used OGC web services available at NCI
  • Demonstrated pulling data from external web services
  • Demonstrated shaping data in the filesystem for merging with web service data
  • Used these data to perform a basic analysis
  • Learned one method of visualising and sharing results

Thanks! Discussion and suggestions welcome.


In [ ]: